Automatic speech recognition (ASR) is increasingly important because it enables users to interact with electronic devices using their voices. However, ASR accuracy depends on the quality of the voice input signals and of the signal processing applied to them. As with any computational process, sending “garbage in” to an ASR service will result in “garbage out.” For example, the audio signal received at a microphone will often be a combination of a user's speech and background noise from any number of other audio sources.
Hardware manufacturers wishing to provide voice control face significant engineering challenges. Some manufacturers struggle to select and configure various components needed for voice control, ranging from microphones to digital signal processors (DSPs). These manufacturers may expend substantial resources and time to cobble together a sub-optimal solution, while other manufacturers choose to leave out voice control entirely.
Overview
The present disclosure relates generally to improving audio processing using an intelligent microphone and, more particularly, to techniques for processing audio received at a microphone with integrated analog-to-digital conversion, digital signal processing, and acoustic source separation, for further processing by a speech recognition system. Embodiments of the present disclosure include intelligent microphone systems designed to collect and process high-quality audio input efficiently. Systems and methods for audio processing using an intelligent microphone include an integrated package with one or more microphones, analog-to-digital converters (ADCs), digital signal processors (DSPs), source separation modules, memory, and automatic speech recognition. Systems and methods are also provided for audio processing using an intelligent microphone that includes a microphone array and uses a preprogrammed audio beamformer calibrated to the included microphone array. A highly integrated product facilitates rolling out speech recognition and voice control systems, and integrating these features in a single product offers additional improvements to audio processing.
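As an illustration only, the idea of a beamformer preprogrammed for a known microphone array can be sketched with a simple delay-and-sum beamformer. The disclosure does not specify a beamforming algorithm, so delay-and-sum is an assumed stand-in here, and the names PRESET_DELAYS and delay_and_sum, the three-microphone array, and the integer sample delays are all hypothetical.

```python
from typing import List, Sequence

# Hypothetical steering delays, in whole samples, one per microphone.
# In an integrated package these could be fixed at manufacture because
# the array geometry is known in advance (an assumption of this sketch).
PRESET_DELAYS = (0, 1, 2)


def delay_and_sum(channels: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Align each channel by its preset delay and average the results.

    channels[k][i] is sample i from microphone k. A positive delay d means
    the target sound reaches that microphone d samples later than the
    reference microphone, so the channel is advanced by d samples.
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    if not channels:
        return []
    n = min(len(ch) for ch in channels)
    out = [0.0] * n
    for ch, d in zip(channels, delays):
        for i in range(n):
            j = i + d  # index of the sample that lines up with output i
            if 0 <= j < n:
                out[i] += ch[j]
    # Averaging keeps the aligned target at its original amplitude,
    # while sounds from other directions add up out of step.
    return [v / len(channels) for v in out]


def pulse(at: int, length: int = 8) -> List[float]:
    """Test signal: a unit impulse at sample index `at`."""
    return [1.0 if i == at else 0.0 for i in range(length)]


# The same pulse reaches the three microphones at samples 2, 3, and 4.
mics = [pulse(2), pulse(3), pulse(4)]
steered = delay_and_sum(mics, PRESET_DELAYS)
unsteered = delay_and_sum(mics, (0, 0, 0))
```

With the preset delays matching the arrival pattern, the three copies of the pulse add coherently at one sample; with all delays set to zero, the same energy is smeared across three samples at one third of the amplitude. A practical beamformer would use fractional delays or per-frequency weights rather than whole-sample shifts.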